Prediction of protein solvent accessibility using support vector machines.

Authors

  • Zheng Yuan
  • Kevin Burrage
  • John S Mattick
Abstract

A Support Vector Machine learning system has been trained to predict protein solvent accessibility from the primary structure. Different kernel functions and sliding window sizes have been explored to find how they affect the prediction performance. Using a cut-off threshold of 15% that splits the dataset evenly (an equal number of exposed and buried residues), this method achieved a prediction accuracy of 70.1% for single sequence input and 73.9% for multiple alignment sequence input. The prediction of three or more states of solvent accessibility was also studied and compared with other methods. The prediction accuracies are better than, or comparable to, those obtained by other methods such as neural networks, Bayesian classification, multiple linear regression, and information theory. In addition, our results further suggest that this system may be combined with other prediction methods to achieve more reliable results, and that the Support Vector Machine method is a very useful tool for biological sequence analysis.
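The abstract does not say how residues are encoded for the classifier, so the following is only a minimal sketch of one common setup, not the authors' implementation. It assumes a one-hot encoding over the 20 standard amino acids plus a hypothetical padding symbol, an odd sliding window centred on the residue being predicted, and a two-state label that treats a relative accessibility strictly above the 15% cut-off as exposed. The names `window_features` and `two_state_labels` are illustrative.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # hypothetical padding symbol for window positions beyond the chain ends
ALPHABET = AMINO_ACIDS + PAD


def one_hot(symbol):
    """Return a 21-dimensional indicator vector for one window position.

    Symbols outside ALPHABET (for example 'X') map to an all-zero vector.
    """
    return [1.0 if symbol == a else 0.0 for a in ALPHABET]


def window_features(sequence, window_size):
    """Encode each residue by the one-hot vectors of its surrounding window."""
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the window has a centre")
    half = window_size // 2
    padded = PAD * half + sequence + PAD * half
    features = []
    for i in range(len(sequence)):
        window = padded[i:i + window_size]  # centred on residue i
        vector = []
        for symbol in window:
            vector.extend(one_hot(symbol))
        features.append(vector)
    return features


def two_state_labels(relative_accessibility, threshold=15.0):
    """Label residues exposed (+1) or buried (-1) at the given cut-off.

    Assumption: values are percentages, and a value equal to the
    threshold counts as buried.
    """
    return [1 if rsa > threshold else -1 for rsa in relative_accessibility]


# Toy input: a made-up five-residue sequence with made-up accessibility values.
X = window_features("MKVLA", window_size=3)
y = two_state_labels([42.0, 8.5, 15.0, 60.2, 3.1])
print(len(X), len(X[0]), y)
```

Each row of `X` has `window_size * 21` entries, so changing the window size changes the input dimension of the classifier. The rows of `X` and the labels in `y` could then be passed to any SVM trainer with a chosen kernel; the choice of library and kernel is left open here because the abstract only states that several kernels and window sizes were compared.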


Similar resources

Prediction of Protein Relative Solvent Accessibility with Support Vector Machines and Long-range Interaction

The prediction of protein relative solvent accessibility gives us helpful information for the prediction of tertiary structure of a protein. The SVMpsi method which uses support vector machines (SVMs) and the position specific scoring matrix (PSSM) generated from PSI-BLAST has been applied to achieve better prediction accuracy of the relative solvent accessibility. We have introduced a three di...


Prediction of protein relative solvent accessibility with support vector machines and long-range interaction 3D local descriptor.

The prediction of protein relative solvent accessibility gives us helpful information for the prediction of tertiary structure of a protein. The SVMpsi method, which uses support vector machines (SVMs), and the position-specific scoring matrix (PSSM) generated from PSI-BLAST have been applied to achieve better prediction accuracy of the relative solvent accessibility. We have introduced a three...


Prediction of Protein Relative Solvent Accessibility with Two-Stage SVM approach

Information on Relative Solvent Accessibility (RSA) of amino acid residues in proteins provides valuable clues to the prediction of protein structure and function. A two-stage approach with Support Vector Machines (SVMs) is proposed, where an SVM predictor is introduced to the output of the single-stage SVM approach to take into account the contextual relationships among solvent accessibilities...


A Comparative Study of Extreme Learning Machines and Support Vector Machines in Prediction of Sediment Transport in Open Channels

The limiting velocity in open channels to prevent long-term sedimentation is predicted in this paper using a powerful soft computing technique known as Extreme Learning Machines (ELM). The ELM is a single Layer Feed-forward Neural Network (SLFNN) with a high level of training speed. The dimensionless parameter of limiting velocity which is known as the densimetric Froude number (Fr) is predicte...


Protein Secondary Structure Prediction Using Support Vector Machines and a New Feature Representation

Knowledge of the secondary structure and solvent accessibility of a protein plays a vital role in the prediction of fold, and eventually the tertiary structure of the protein. A challenging issue of predicting protein secondary structure from sequence alone is addressed. Support vector machines (SVM) are employed for the classification and the SVM outputs are converted to posterior probabilitie...



Journal:
  • Proteins

Volume 48, Issue 3

Pages: -

Publication date: 2002